-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Autodesk: [hdSt] Early parallel MaterialX codegen, launched by a scene index #3847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
|
Filed as internal issue #USD-11537 (This is an automated message. See here for more information.) |
| HdSceneIndexAdapterSceneDelegate::GetMaterialResourceFromSceneIndexPrim( | ||
| HdSceneIndexPrim& prim, const TfTokenVector& renderContexts) | ||
| { | ||
| TRACE_FUNCTION(); | ||
| HF_MALLOC_TAG_FUNCTION(); | ||
| HdSceneIndexPrim prim = _GetInputPrim(id); | ||
|
|
||
| HdMaterialSchema matSchema = HdMaterialSchema::GetFromParent( | ||
| prim.dataSource); | ||
| if (!matSchema.IsDefined()) { | ||
| return VtValue(); | ||
| } | ||
|
|
||
| // Query for a material network to match the requested render contexts | ||
| const TfTokenVector renderContexts = | ||
| GetRenderIndex().GetRenderDelegate()->GetMaterialRenderContexts(); | ||
| HdMaterialNetworkSchema netSchema = matSchema.GetMaterialNetwork(renderContexts); | ||
| HdMaterialNetworkSchema netSchema = | ||
| matSchema.GetMaterialNetwork(renderContexts); | ||
| if (!netSchema.IsDefined()) { | ||
| return VtValue(); | ||
| } | ||
|
|
||
| return VtValue(_ToMaterialNetworkMap(netSchema, renderContexts)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extracted this static function for code reuse between the old synchronous code path and the new scene index which gets scene index prims without any conversions.
| list(APPEND optionalPrivateClasses | ||
| materialXFilter | ||
| materialXShaderGen | ||
| materialXSyncSceneIndex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new scene index implementation, which replaces materialXSyncDispatcher in the original PR.
pxr/imaging/hdSt/material.cpp
Outdated
| _hdStMaterialNetwork.ProcessMaterialNetwork(GetId(), hdNetworkMap, | ||
| resourceRegistry.get()); | ||
| fragmentSource = _networkProcessor.GetFragmentCode(); | ||
| volumeSource = _networkProcessor.GetVolumeCode(); | ||
| displacementSource = _networkProcessor.GetDisplacementCode(); | ||
| materialMetadata = _networkProcessor.GetMetadata(); | ||
| materialTag = _networkProcessor.GetMaterialTag(); | ||
| params = _networkProcessor.GetMaterialParams(); | ||
| textureDescriptors = _networkProcessor.GetTextureDescriptors(); | ||
| fragmentSource = _hdStMaterialNetwork.GetFragmentCode(); | ||
| volumeSource = _hdStMaterialNetwork.GetVolumeCode(); | ||
| displacementSource = _hdStMaterialNetwork.GetDisplacementCode(); | ||
| materialMetadata = _hdStMaterialNetwork.GetMetadata(); | ||
| materialTag = _hdStMaterialNetwork.GetMaterialTag(); | ||
| params = _hdStMaterialNetwork.GetMaterialParams(); | ||
| textureDescriptors = _hdStMaterialNetwork.GetTextureDescriptors(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to rename _networkProcessor to _hdStMaterialNetwork. IMHO "Processor" sounds confusing here because the object _hdStMaterialNetwork is simply initialized with those Process methods from hdNetworkMap, it doesn't process some external objects.
| #ifdef PXR_MATERIALX_SUPPORT_ENABLED | ||
| { | ||
| HdStRenderDelegate* stormDelegate = static_cast<HdStRenderDelegate*>( | ||
| sceneDelegate->GetRenderIndex().GetRenderDelegate()); | ||
|
|
||
| if (HdSt_MaterialXSyncSceneIndex* sceneIndex = | ||
| stormDelegate->GetMaterialXSyncSceneIndex()) { | ||
|
|
||
| sceneIndex->Wait(); | ||
| } | ||
| } | ||
| #endif |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
| HdMaterialNode2 const* | ||
| HdSt_GetTerminalNode( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now need to call it from outside this source file.
| return; | ||
| } | ||
|
|
||
| ProcessFilterTask(materialId, filterTask, isVolume, resourceRegistry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ProcessMaterialNetwork is now implemented via ProcessFilterTask for code reuse.
| auto filterTask = std::make_shared<HdSt_MaterialFilterTask>(); | ||
| filterTask->hdNetwork = | ||
| HdConvertToHdMaterialNetwork2(hdNetworkMap, &isVolume); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The state necessary for the codegen process is now encapsulated in the HdSt_MaterialFilterTask class, because it needs to be created earlier than the material sprim and to persist until the parallel codegen task is finished.
HdStMaterialNetwork::ProcessMaterialNetwork is called in the old, synchronous code path, but we create a HdSt_MaterialFilterTask here in order to share code between the two code paths.
| void | ||
| HdSt_MaterialFilterTask::AddFallbackDomeLightTextureNode() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and a few other functions are now methods of HdSt_MaterialFilterTask.
| const SdfPath domeTexturePath = | ||
| terminalNodePath.ReplaceName(_tokens->domeLightFallback); | ||
| hdNetwork->nodes.insert({domeTexturePath, hdDomeTextureNode}); | ||
| hdNetwork.nodes.insert({domeTexturePath, hdDomeTextureNode}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and a few other related pieces of data are now encapsulated as members of HdSt_MaterialFilterTask.
| size_t | ||
| HdSt_MaterialFilterTask::BuildAnonymizedMaterialNetwork( | ||
| HdMaterialNetwork2* anonNetwork) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unified the terminology around "anonymized". The input parameters are replaced by HdSt_MaterialFilterTask's members.
| static mx::ShaderPtr | ||
| _GenerateMaterialXShader( | ||
| HdMaterialNetwork2 const& hdNetwork, | ||
| HdSt_MaterialXGeneratorTask::HdSt_MaterialXGeneratorTask( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a different kind of "task" - the necessary data passed to the lambda executing in parallel in TBB and wrapping the MaterialX codegen.
| HdSt_MaterialXGeneratorTask generatorTask( | ||
| filterTask, | ||
| materialPath, | ||
| *resourceRegistry->GetHgi()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating the generator task on the stack in the case when parallel codegen is off.
| return std::make_unique<HdSt_MaterialXGeneratorTask>( | ||
| filterTask, | ||
| materialPath, | ||
| hgi); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating the generator task on the heap for parallel codegen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One important change from the original implementation is that the generator task now gets a shared pointer to the filter task and holds on to it. The generation process reads some material network data which is part of the filter task, so it's important to ensure that it's not destroyed before codegen is complete. In the original implementation, the sprim held ownership of the filter task for the duration of codegen.
| @@ -0,0 +1,180 @@ | |||
| // | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaces pxr/imaging/hdSt/materialXSyncDispatcher.cpp in the original implementation
| if (resourceRegistry->ContainsMaterialXShader( | ||
| generatorTask->GetShaderHash())) { | ||
| // We already have a shader for this topology. | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First check that the shader hasn't been registered before the early parallel init. This is a new method which, unlike RegisterMaterialXShader, doesn't add a new instance to the registry if an existing one isn't found.
Reusing RegisterMaterialXShader in combination with IsFirstInstance would be problematic because the intended usage pattern, in the case when there's no existing value, seems to be:
- Insert a
nullptrvalue and return a registry instance pointing to that value and holding a lock to the whole MaterialX registry. - The client code is supposed to populate the instance with a non-
nullptrvalue while holding the lock. The instance'sIsFirstInstancemethod returnstrueif the value isnullptr.
So the above pattern can't be used for parallel tasks, when the uniqueness check has to happen before launching the task, the non-nullptr value is known only when the task completes and the lock has to be released ASAP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
| // Use a separate concurrent container for tracking unique generator tasks. | ||
| // The generated shader will be registered in the resource registry when | ||
| // the task completes. | ||
| auto insertResult = | ||
| _generatorTaskSet.insert(generatorTask->GetShaderHash()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another uniqueness check, this time between the different parallel tasks, using a dedicated data structure.
| #ifdef PXR_MATERIALX_SUPPORT_ENABLED | ||
| TF_DEFINE_ENV_SETTING(HDST_ENABLE_PARALLEL_MTLX_CODEGEN, false, | ||
| "Enable early parallelized MaterialX codegen"); | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this setting definition here for better encapsulation.
| void | ||
| HdStRenderDelegate::SetTerminalSceneIndex( | ||
| const HdSceneIndexBaseRefPtr &terminalSceneIndex) | ||
| { | ||
| if (!TfGetEnvSetting(HDST_ENABLE_PARALLEL_MTLX_CODEGEN)) { | ||
| return; | ||
| } | ||
|
|
||
| _materialXSyncSceneIndex = HdSt_MaterialXSyncSceneIndexRefPtr( | ||
| new HdSt_MaterialXSyncSceneIndex(terminalSceneIndex, *this)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Injecting our scene index as the very last in the chain, and only if the optimization is enabled.
| HdSt_MaterialXSyncSceneIndex* | ||
| HdStRenderDelegate::GetMaterialXSyncSceneIndex() | ||
| { | ||
| return get_pointer(_materialXSyncSceneIndex); | ||
| } | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently only used in HdStMaterial::Sync to wait for the parallel codegen, but could be used in the future to reuse some per-material state.
| bool | ||
| HdStResourceRegistry::ContainsMaterialXShader( | ||
| HdInstance<MaterialX::ShaderPtr>::ID id) | ||
| { | ||
| bool found = false; | ||
| _materialXShaderRegistry.FindInstance(id, &found); | ||
| return found; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new method just wraps existing functionality.
|
FYI @tcauchois |
…llel_mtlx_codegen_si
| // Wait for all early parallel codegen tasks to complete and | ||
| // retrieve the state cached for this sprim when codegen was | ||
| // started | ||
| filterTask = sceneIndex->WaitAndExtractFilterTask(GetId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait for the codegen tasks to complete. Previously, we were waiting by calling a new render delegate API before syncing the materials: https://github.com/PixarAnimationStudios/OpenUSD/pull/3567/files#r1995705092
Now, we wait for the generator task group in each material's sync. In practice, only the first material sprim would ever wait for any measurable duration, and the overhead of waiting for all subsequent sprims is negligible (around 1 microsecond).
This is also where we get the cached per-material state from the scene index. The filter task is removed from the scene index its ownership is transferred to the local variable.
Reusing the filter task allows us to avoid getting the material resource again below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible instead to have the material prim datasource call wait on the scene index, so that Storm doesn't need to deal with the new scene index at all? I'll try to sketch out a proposal in the scene index code.
| _hdStMaterialNetwork.ProcessMaterialNetwork(GetId(), | ||
| hdNetworkMap, resourceRegistry.get()); | ||
|
|
||
| processedMaterialNetwork = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the material network has processed the filter task, we don't need the task any more, so we destroy it by letting it go out of scope.
| HD_TRACE_FUNCTION(); | ||
|
|
||
| for (const HdSceneIndexObserver::DirtiedPrimEntry& entry : entries) { | ||
| // If the dirtied prim is a material for which we've cached a filter | ||
| // task, remove the task since it's now stale. Any ongoing generator | ||
| // tasks will keep their own shared pointers to the respective filter | ||
| // tasks, so this is safe. If the element is not in the map, then the | ||
| // lookup won't involve locking. | ||
| _FilterTaskMap::accessor accessor; | ||
| if (_filterTaskMap.find(accessor, entry.primPath)) { | ||
| _filterTaskMap.erase(accessor); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting the filter task on dirtying.
| { | ||
| _FilterTaskMap::accessor accessor; | ||
| _filterTaskMap.insert(accessor, id); | ||
| accessor->second = generatorTask->GetFilterTask(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caching the filter task in the scene index.
FYI @tcauchois @klucknav I've pushed another change to this PR reintroducing the reuse of filter tasks in This addresses the other major feedback item from #3567. To recap, filter tasks can become stale if the material changes after parallel codegen started but before the sprim is synced. Animated materials is the only known case of this, and USD has tests for animated materials. This should be the last change in this branch. |
tcauchois
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Pavlo, this is great progress! Moving the task launching to Hydra 2 PrimsAdded looks like it cut out a bunch of the hd API changes, and it should hopefully be letting you launch the tasks slightly earlier as well!
I feel like if we fold the WaitAndExtractFilterTask call into some kind of datasource access call inside MaterialXSyncSceneIndex we can additionally get rid of some of the new render delegate API in hdSt/renderDelegate.h, and make the scene index more modular/reusable, which I think a bunch of folks would appreciate past just the Storm team!
Happy to expand on my thoughts in the checkin tomorrow or on slack but this is definitely a good direction.
| // Wait for all early parallel codegen tasks to complete and | ||
| // retrieve the state cached for this sprim when codegen was | ||
| // started | ||
| filterTask = sceneIndex->WaitAndExtractFilterTask(GetId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible instead to have the material prim datasource call wait on the scene index, so that Storm doesn't need to deal with the new scene index at all? I'll try to sketch out a proposal in the scene index code.
| SdfPath surfTerminalPath; | ||
| if (HdMaterialNode2 const* surfTerminal = | ||
| _GetTerminalNode(surfaceNetwork, terminalName, &surfTerminalPath)) { | ||
| filterTask->terminalNode = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here & material.*, instead of bringing the "filterTask" API into a bunch of files in Storm, you could modularize things better by either:
(1) having the scene index produce a "glsl" material network that can be read by the non-materialx code; or
(2) having the scene index produce a digested material network as code snippets for each stage & params, and pass them through as datasources. If material.cpp finds that, it uses it, and otherwise it falls back to network processing.
This splits the code up a little better, which is nice since the MaterialX code is an external dependency with an evolving API and hidden behind a build flag, so having a big API interface between the two makes me a bit worried.
| using HdSt_MaterialFilterTaskSharedPtr = | ||
| std::shared_ptr<HdSt_MaterialFilterTask>; | ||
|
|
||
| extern HdMaterialNode2 const* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think this needs an extern, but if you want it to be publicly visible (which maybe we don't care?) you could throw an HDST_API on it...
| /// synchronously, or on the heap, owned by the respective Sprim, if the | ||
| /// codegen happens in parallel tasks. | ||
| /// | ||
| struct ARCH_EXPORT_TYPE HdSt_MaterialFilterTask final |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like an implementation detail of the MaterialX code, and I don't think this belongs in hdSt/materialNetwork.h. Also curious why we need ARCH_EXPORT_TYPE here.
| HdSt_MaterialXGeneratorTask( | ||
| HdSt_MaterialFilterTaskSharedPtr filterTask, | ||
| SdfPath const& materialPath, | ||
| Hgi const& hgi); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you're improving on the status quo here (which is to pass in resource registry and pull hgi out of that), but it would be cleaner to just pass in the relevant info, namely bindless support and shading language.
| if (resourceRegistry->ContainsMaterialXShader( | ||
| generatorTask->GetShaderHash())) { | ||
| // We already have a shader for this topology. | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
| HdSceneIndexPrim | ||
| HdSt_MaterialXSyncSceneIndex::GetPrim(const SdfPath &primPath) const | ||
| { | ||
| // Just forward to input scene index - we don't modify prim data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was envisioning that you call WaitAndExtractFilterTask here, either when you call GetPrim() on a materialx prim, or potentially when you call primDataSource->Get("material"). That way, you don't need all of the complicated logic in renderDelegate.h to pass the scene index pointer through.
|
|
||
| #ifdef PXR_MATERIALX_SUPPORT_ENABLED | ||
| void | ||
| HdStRenderDelegate::SetTerminalSceneIndex( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in the scene index, this bit breaks a bunch of encapsulation. We need GetMaterialXSyncSceneIndex so we can call WaitAndExtractFilterTask in material.cpp, but if we do that in the scene index GetPrim() or a datasource instead we don't need this call anymore. Meanwhile, rather than grafting the new scene index onto the terminal scene index like this, the more flexible/preferred way is to register a scene index plugin that inserts it somewhere relative to other scene index filters; it's a small bit of boilerplate you can grab from e.g. the dependencySceneIndexPlugin.* or something.
Then the only issue is your scene index's use of the render delegate, but it shouldn't really be getting a pointer to the render delegate anyway. It does need a pointer to a registry of MaterialX codegen results, but we could either find a lower key way to pass in the resource registry, or just have a local registry on the scene index instead.
Description of Change(s)
This is a refactoring of #3567 incorporating the code review feedback, namely:
HdDirtyBits-based mechanism to check if the filter task cached when the codegen was launched can be reused during material sync. Instead, we always derive the necessary data again, which is a minor overhead relative to the codegen process itself. This can be further optimized in the future using an alternative mechanism, e.g. based on the change tracker as suggested in the original code review.Link to proposal (if applicable)
Fixes Issue(s)
Checklist
I have created this PR based on the dev branch
I have followed the coding conventions
I have added unit tests that exercise this functionality (Reference:
testing guidelines)
I have verified that all unit tests pass with the proposed changes
I have submitted a signed Contributor License Agreement (Reference:
Contributor License Agreement instructions)